Abstract. The $\mathcal{EL}$ family of Description Logics (DLs) has been the subject of
interest in recent years. On the one hand, these DLs are tractable, but fairly
inexpressive. On the other hand, these DLs can be used for designing different classes
of ontologies, most notably ontologies from the medical domain. Unfortunately,
building ontologies is error-prone. As a result, inferable subsumption relations
among concepts may be unintended. In recent years, the problem of axiom
pinpointing has been studied with the purpose of providing minimal sets of axioms
that explain unintended subsumption relations. For the concrete case of $\mathcal{EL}$ and
$\mathcal{EL}^+$, the most efficient approaches consist of encoding the problem into
propositional logic, specifically as a Horn formula, which is then analyzed with a
dedicated algorithm. This paper builds on this earlier work, but exploits the important
relationship between minimal axiom sets and minimal unsatisfiable subformulas
in the propositional domain. In turn, this relationship allows applying a vast
body of recent work in the propositional domain to the concrete case of axiom
pinpointing for $\mathcal{EL}$ and its variants. From a practical perspective, the algorithms
described in this paper are often several orders of magnitude more efficient than
the current state of the art in axiom pinpointing for the $\mathcal{EL}$ family of DLs.


1 Introduction 

Axiom pinpointing denotes the problem of computing one minimal axiom set (denoted
MinA), which explains a subsumption relation in an ontology [29]. Axiom pinpointing
for different description logics (DLs) has been studied extensively over the last
decade [29,25,20,5,7,15,34,30,9,31,6,21,24,32], but some of the algorithms used can
be traced back to the mid 90s [4]. Description logics of interest have included
$\mathcal{ALC}$, $\mathcal{SHIF}$ and $\mathcal{SHOIN}$, in addition to the $\mathcal{EL}$ family of
lightweight DLs. More recent work has focused on the tractable, albeit fairly
inexpressive, $\mathcal{EL}$ family of DLs. The reason for this interest is that the
$\mathcal{EL}$ family of DLs finds important applications, including the design of medical
ontologies. Besides computing one MinA, axiom pinpointing is also concerned with
computing all MinAs [31,32] or computing MinAs on demand [7] 1 .

Original work on axiom pinpointing for the $\mathcal{EL}$ family of DLs used the well-known
labeling-based classification algorithm [8,7] to find all MinAs.

1 Following existing nomenclature, MinA denotes a single axiom set, whereas MinAs denotes
multiple axiom sets [7,31].



The proposed approach [8,7] generates a propositional formula of worst-case exponential
size, which is then used for computing all MinAs by finding all the minimal models of
this formula. More recent work [31,32] proposed an encoding of the problem of
computing all MinAs as a propositional Horn formula. As an important additional result,
it was shown that, in the worst case, the resulting formulas are exponentially more
compact than those of earlier work. In addition, this work proposed dedicated algorithms
for computing MinAs, based on propositional satisfiability (SAT) solving, but exploiting
techniques used in AllSMT algorithms [16]. Although effective at computing MinAs, these
dedicated algorithms often fail to enumerate all MinAs to completion, or to prove that no
additional MinAs exist. Nevertheless, the practical application of $\mathcal{EL}^+$ in medical
ontologies and the need for axiom pinpointing motivate the development of more
efficient approaches.

The main contribution of our work is to show that the computation of MinAs can
be related to the extraction of minimal unsatisfiable subformulas (MUSes) of the Horn
formula encoding proposed in earlier work [31,32]. More concretely, a subformula of
this encoding is an MUS if and only if it represents one MinA [31,32]. Although this
connection is straightforward in hindsight, we point out that it had not been investigated
since the work was first published in 2009 [31]. The relationship between MUSes and
MinAs allows tapping into the large recent body of work on extracting MUSes, but also
on Minimal Correction Subsets (MCSes), as well as their minimal hitting set
relationship [28,13,10,18,11,17,26,19], which for the propositional case allows exploiting
the performance of modern SAT solvers. The relationship also allows exploring the vast
body of recent work on solving maximum satisfiability (MaxSAT) [1,22] and on
enumerating MaxSAT solutions [23]. The main practical consequence of this insight is
that, by exploiting the Horn formula encoding proposed in earlier work [31,32], we are
able to compute the set of MinAs for the vast majority of existing problem instances,
most often with performance improvements of many orders of magnitude over the
current state of the art [31,32].

The paper is organized as follows. Section 2 overviews essential definitions and
introduces the notation used throughout the paper. Section 3 overviews existing work
on axiom pinpointing for the $\mathcal{EL}$ family of DLs. Section 4 briefly overviews recent
work on MUS enumeration, detailing the approach used in the paper. Section 5 relates
SAT-based axiom pinpointing in $\mathcal{EL}^+$ with MUS extraction and enumeration for
propositional formulae, and summarizes the organization of a new $\mathcal{EL}^+$ axiom
pinpointing tool, EL2MCS. Experimental results on well-known problem instances are
analyzed in Section 6. Finally, the paper concludes in Section 7.


2 Preliminaries 


This section introduces the notation and background material used throughout the paper,
related both to description logics, namely the $\mathcal{EL}$ family of DLs, and to
propositional satisfiability. Section 2.1 briefly overviews $\mathcal{EL}^+$. Afterwards,
Section 2.2 summarizes work on axiom pinpointing in $\mathcal{EL}^+$. Finally, Section 2.3
reviews SAT-related definitions.
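                            Syntax                                       Semantics

top                         $\top$                                       $\Delta^{\mathcal{I}}$
conjunction                 $X \sqcap Y$                                 $X^{\mathcal{I}} \cap Y^{\mathcal{I}}$
existential restriction     $\exists r.X$                                $\{x \in \Delta^{\mathcal{I}} \mid \exists y \in \Delta^{\mathcal{I}} : (x, y) \in r^{\mathcal{I}} \wedge y \in X^{\mathcal{I}}\}$

general concept inclusion   $X \sqsubseteq Y$                            $X^{\mathcal{I}} \subseteq Y^{\mathcal{I}}$
role inclusion              $r_1 \circ \cdots \circ r_n \sqsubseteq s$   $r_1^{\mathcal{I}} \circ \cdots \circ r_n^{\mathcal{I}} \subseteq s^{\mathcal{I}}$

Table 1: Syntax and semantics of $\mathcal{EL}^+$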


2.1 Lightweight Description Logic $\mathcal{EL}^+$

This section follows standard descriptions of $\mathcal{EL}^+$ [7,31]. $\mathcal{EL}^+$ belongs to the
$\mathcal{EL}$ family of lightweight description logics. Starting from a set $N_C$ of concept
names and a set $N_R$ of role names, concept descriptions in $\mathcal{EL}^+$ are defined
inductively using the constructors shown in the top part of Table 1. (As standard in
earlier work, uppercase letters $X, X_i, Y, Y_i$ denote generic concepts, uppercase letters
$C, C_i, D, D_i, E, E_i$ denote concept names and lowercase letters $r, s$ denote role
names.) A TBox (or ontology) in $\mathcal{EL}^+$ is a finite set of general concept inclusion
(GCI) and role inclusion (RI) axioms. The syntax of GCIs and RIs is shown in the bottom
part of Table 1. For a TBox $\mathcal{T}$, $PC_{\mathcal{T}}$ denotes the set of primitive concepts
of $\mathcal{T}$, representing the smallest set of concepts that contains: (i) the top concept
$\top$; and (ii) all the concept names used in $\mathcal{T}$. $PR_{\mathcal{T}}$ denotes the set of
primitive roles of $\mathcal{T}$, representing the set of all role names used in $\mathcal{T}$. In
addition, and for convenience, $X \equiv Y$ corresponds to the two GCIs $X \sqsubseteq Y$
and $Y \sqsubseteq X$. Finally, and throughout the paper, $\mathcal{A}$ denotes a set of assertions.

The semantics of $\mathcal{EL}^+$ is defined in terms of interpretations. An interpretation
$\mathcal{I}$ is a tuple $(\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, where $\Delta^{\mathcal{I}}$ represents the
domain, i.e. a non-empty set of individuals, and the interpretation function
$\cdot^{\mathcal{I}}$ maps each concept name $C \in N_C$ to a set $C^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$,
and maps each role name $r \in N_R$ to a binary relation $r^{\mathcal{I}}$ defined on
$\Delta^{\mathcal{I}}$, i.e. $r^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$.
The third column of Table 1 details the inductive definitions of $\cdot^{\mathcal{I}}$ for
arbitrary concept descriptions. An interpretation $\mathcal{I}$ is a model of a TBox
$\mathcal{T}$ if and only if the conditions in the semantics (third) column of Table 1 are
respected for every GCI and RI axiom in $\mathcal{T}$.

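The main inference problem for $\mathcal{EL}^+$ is the subsumption problem:

Definition 1 (Concept Subsumption). Let $C, D$ represent two $\mathcal{EL}^+$ concept
descriptions and let $\mathcal{T}$ represent an $\mathcal{EL}^+$ TBox. $C$ is subsumed by $D$ w.r.t.
$\mathcal{T}$ (denoted $C \sqsubseteq_{\mathcal{T}} D$) if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ in every
model $\mathcal{I}$ of $\mathcal{T}$.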

Subsumption algorithms assume that a given TBox $\mathcal{T}$ is normalized [3].
Normalization can be viewed as the process of breaking complex GCIs into simpler ones. It
is well-known that a TBox $\mathcal{T}$ can be normalized in linear time [3]. Moreover, given a
normalized TBox, concept subsumption can be determined in polynomial time [3].

Theorem 1 (Theorem 1 in [7]). Given a TBox $\mathcal{T}$ in normal form, the subsumption
algorithm runs in polynomial time in the size of $\mathcal{T}$.
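
To make the polynomial-time claim concrete, the following minimal Python sketch saturates a normalized TBox with completion-style rules until a fixpoint is reached. It is an illustration under assumptions, not the algorithm of [3]: the function name `classify`, the tuple-based axiom layout and the naive fixpoint loop are hypothetical choices made here, and simple role inclusions of the form $r \sqsubseteq s$ are omitted for brevity. The input is the normalized form of the medical ontology of Table 2.

```python
def classify(names, conj, ex_rhs, ex_lhs, chains):
    """Naive saturation for a normalized TBox (illustrative layout, not from [3]).

    conj   : list of ((C1, ..., Ck), D)  for  C1 and ... and Ck subsumed by D
    ex_rhs : list of (C, r, D)           for  C subsumed by (exists r.D)
    ex_lhs : list of (r, C, D)           for  (exists r.C) subsumed by D
    chains : list of (r, s, t)           for  r o s subsumed by t
    """
    S = {x: {x, "Top"} for x in names}   # S[x]: known subsumers of x
    R = set()                            # (role, x, y): x subsumed by exists role.y
    changed = True
    while changed:
        changed = False
        for x in names:
            for lhs, d in conj:
                if set(lhs) <= S[x] and d not in S[x]:
                    S[x].add(d)
                    changed = True
            for c, role, d in ex_rhs:
                if c in S[x] and (role, x, d) not in R:
                    R.add((role, x, d))
                    changed = True
        for role, x, y in list(R):
            for role2, c, d in ex_lhs:
                if role2 == role and c in S[y] and d not in S[x]:
                    S[x].add(d)
                    changed = True
            for first, second, target in chains:
                if first != role:
                    continue
                for role3, y2, z in list(R):
                    if role3 == second and y2 == y and (target, x, z) not in R:
                        R.add((target, x, z))
                        changed = True
    return S


names = ["Endocarditis", "Inflammation", "Disease", "Tissue", "Endocardium",
         "HeartValve", "Heart", "HeartDisease", "N"]
conj = [(("Endocarditis",), "Inflammation"), (("Inflammation",), "Disease"),
        (("Endocardium",), "Tissue"), (("HeartDisease",), "Disease"),
        (("Disease", "N"), "HeartDisease")]
ex_rhs = [("Endocarditis", "hasLoc", "Endocardium"),
          ("Inflammation", "actsOn", "Tissue"),
          ("Endocardium", "contIn", "HeartValve"),
          ("HeartValve", "contIn", "Heart"),
          ("HeartDisease", "hasLoc", "Heart")]
ex_lhs = [("hasLoc", "Heart", "N")]
chains = [("contIn", "contIn", "contIn"), ("hasLoc", "contIn", "hasLoc")]

subsumers = classify(names, conj, ex_rhs, ex_lhs, chains)
print(sorted(subsumers["Endocarditis"]))
```

Each pass either adds a subsumer or a role edge, and both sets are bounded polynomially in the number of concept and role names, which is the intuition behind the polynomial bound. On this input the saturation places HeartDisease among the subsumers of Endocarditis, in line with the classification shown in Table 2.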
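$\mathcal{T}_{mg}$ := {
    $\mathsf{Endocarditis} \sqsubseteq \mathsf{Inflammation} \sqcap \exists\mathsf{hasLoc}.\mathsf{Endocardium}$,
    $\mathsf{Inflammation} \sqsubseteq \mathsf{Disease} \sqcap \exists\mathsf{actsOn}.\mathsf{Tissue}$,
    $\mathsf{Endocardium} \sqsubseteq \mathsf{Tissue} \sqcap \exists\mathsf{contIn}.\mathsf{HeartValve}$,
    $\mathsf{HeartValve} \sqsubseteq \exists\mathsf{contIn}.\mathsf{Heart}$,
    $\mathsf{HeartDisease} \equiv \mathsf{Disease} \sqcap \exists\mathsf{hasLoc}.\mathsf{Heart}$,
    $\mathsf{contIn} \circ \mathsf{contIn} \sqsubseteq \mathsf{contIn}$,
    $\mathsf{hasLoc} \circ \mathsf{contIn} \sqsubseteq \mathsf{hasLoc}$ }

$\mathcal{A}_{mg}$ := {
    $\mathsf{Endocarditis} \sqsubseteq \mathsf{Inflammation}$,
    $\mathsf{Inflammation} \sqsubseteq \mathsf{Disease}$,
    $\mathsf{Endocardium} \sqsubseteq \mathsf{Tissue}$,
    $\mathsf{HeartDisease} \sqsubseteq \mathsf{Disease}$,
    $\mathsf{Endocarditis} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{Endocardium}$,
    $\mathsf{Inflammation} \sqsubseteq \exists\mathsf{actsOn}.\mathsf{Tissue}$,
    $\mathsf{Endocardium} \sqsubseteq \exists\mathsf{contIn}.\mathsf{HeartValve}$,
    $\mathsf{HeartValve} \sqsubseteq \exists\mathsf{contIn}.\mathsf{Heart}$,
    $\mathsf{HeartDisease} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{Heart}$,
    $\mathsf{Disease} \sqcap \mathsf{N} \sqsubseteq \mathsf{HeartDisease}$,
    $\exists\mathsf{hasLoc}.\mathsf{Heart} \sqsubseteq \mathsf{N}$,
    $\mathsf{Endocarditis} \sqsubseteq \mathsf{Disease}$,
    $\mathsf{Endocarditis} \sqsubseteq \exists\mathsf{actsOn}.\mathsf{Tissue}$,
    $\mathsf{Endocarditis} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{HeartValve}$,
    $\mathsf{Endocardium} \sqsubseteq \exists\mathsf{contIn}.\mathsf{Heart}$,
    $\mathsf{HeartDisease} \sqsubseteq \mathsf{N}$,
    $\mathsf{Endocarditis} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{Heart} \leftarrow \mathsf{N}$,
    $\mathsf{Endocarditis} \sqsubseteq \mathsf{N}$,
    $\mathsf{Endocarditis} \sqsubseteq \mathsf{HeartDisease}$ }

Table 2: An example $\mathcal{T}_{mg}$ medical ontology and its classification.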

A classification of TBox $\mathcal{T}$ represents all subsumption relations between concept
names in $\mathcal{T}$. Similarly to subsumption, and because of the worst-case quadratic
number of subsumption relations, classification can be determined in polynomial time in
the size of $\mathcal{T}$ [3,7].

Example 1 (GALEN-based Medical Ontology Example). Let us consider an $\mathcal{EL}^+$
medical ontology adapted from the GALEN medical ontology [27], and shown in Table 2.
The ontology expresses a medical condition in which endocarditis is classified as a
heart disease (i.e. $\mathsf{Endocarditis} \sqsubseteq \mathsf{HeartDisease}$). This disease occurs
due to bacteria that damage the endocardium, a tissue that protects the heart valves.
The table shows both the ontology and its classification. The normalized ontology
consists of 11 GCIs and 2 role inclusion axioms (i.e. $\mathsf{contIn} \circ \mathsf{contIn} \sqsubseteq \mathsf{contIn}$
and $\mathsf{hasLoc} \circ \mathsf{contIn} \sqsubseteq \mathsf{hasLoc}$). Only assertions are added to the
assertion axiom set $\mathcal{A}_{mg}$. Regarding the classification of this ontology,
$\mathsf{N}$ denotes the reference to the assertion axiom
$\mathsf{Endocarditis} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{Heart}$, and serves to simplify the
notation of the assertion axiom set.
2.2 Axiom Pinpointing in $\mathcal{EL}^+$

Axiom pinpointing is the problem of explaining unintended subsumption relations in
description logics. As is the case in so many other areas of research, the goal of axiom
pinpointing is to find minimal explanations for unintended subsumption relations. The
importance of this research topic is illustrated by the large body of work, targeting
different description logics [29,25,20,5,7,15,34,30,9,31,6,21,32]. For the concrete case
of the $\mathcal{EL}$ family of DLs, two main approaches have been proposed [5,7,31,32].

Earlier work [5,7] consists of creating a pinpointing formula $\phi$, and then
enumerating the minimal models of $\phi$. This is the approach implemented in the CEL
tool [5]. The main drawback of this work is that the pinpointing formula $\phi$ is
worst-case exponential in the size of $\mathcal{T}$, and enumeration of minimal models is
NP-hard. In contrast, more recent work showed that concept subsumption can be encoded as
a Horn formula, and that the axiom pinpointing problem can be solved on this Horn
formula [31,32]. This approach is detailed in Section 3.

Throughout this paper, the following standard definition of MinA (and of nMinA) 
is assumed. 

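Definition 2 (nMinA/MinA). Let $\mathcal{T}$ be an $\mathcal{EL}^+$ TBox, and let
$C, D \in PC_{\mathcal{T}}$ be primitive concept names, with $C \sqsubseteq_{\mathcal{T}} D$. Let
$\mathcal{S}$ be a subset of $\mathcal{T}$ such that $C \sqsubseteq_{\mathcal{S}} D$. If $\mathcal{S}$ is
such that $C \sqsubseteq_{\mathcal{S}} D$ and $C \not\sqsubseteq_{\mathcal{S}'} D$ for every
$\mathcal{S}' \subset \mathcal{S}$, then $\mathcal{S}$ is a minimal axiom set (MinA) w.r.t.
$C \sqsubseteq_{\mathcal{T}} D$. Otherwise, $\mathcal{S}$ is a non-minimal axiom set (nMinA) w.r.t.
$C \sqsubseteq_{\mathcal{T}} D$.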

2.3 Propositional Satisfiability 

Standard propositional satisfiability (SAT) definitions are assumed [12]. This includes
standard definitions for variables, literals, clauses and CNF formulas. Formulas are
represented by $\mathcal{F}$, $\mathcal{M}$, $\mathcal{M}'$, $\mathcal{C}$ and $\mathcal{C}'$, but also by
$\varphi$ and $\phi$. Horn formulae are such that every clause contains at most one
positive literal. It is well-known that SAT on Horn formulae can be decided in linear
time [14]. The paper explores both MUSes and MCSes of propositional formulae.
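
As a small illustration of why Horn formulae are easy to decide, the sketch below checks satisfiability by forward chaining (unit propagation on definite clauses). It is a minimal sketch under assumptions: the name `horn_sat` and the `(body, head)` clause representation are hypothetical, and the loop is quadratic for simplicity, whereas the linear-time bound of [14] requires per-clause counters and watch lists.

```python
def horn_sat(clauses):
    """Decide satisfiability of a Horn formula by forward chaining.

    Each clause is a pair (body, head): body is a frozenset of variables and
    head is a variable, or None for a purely negative clause.
    Returns (True, minimal_model) or (False, None).
    """
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= true_vars:
                if head is None:
                    # every variable in the body is forced, so the clause is violated
                    return False, None
                if head not in true_vars:
                    true_vars.add(head)
                    changed = True
    return True, true_vars


definite = [
    (frozenset(), "a"),            # a
    (frozenset({"a"}), "b"),       # a -> b
    (frozenset({"a", "b"}), "c"),  # a and b -> c
]
print(horn_sat(definite))
print(horn_sat(definite + [(frozenset({"c"}), None)]))  # adds the clause (not c)
```

The variables collected by the loop form the least model of the definite clauses, so a negative clause is violated exactly when all of its variables end up in that set.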

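Definition 3 (MUS). $\mathcal{M} \subseteq \mathcal{F}$ is a Minimal Unsatisfiable Subformula (MUS)
of $\mathcal{F}$ iff $\mathcal{M}$ is unsatisfiable and $\forall_{\mathcal{M}' \subsetneq \mathcal{M}}\, \mathcal{M}'$
is satisfiable.

Definition 4 (MCS). $\mathcal{C} \subseteq \mathcal{F}$ is a Minimal Correction Subformula (MCS) of
$\mathcal{F}$ iff $\mathcal{F} \setminus \mathcal{C}$ is satisfiable and
$\forall_{\mathcal{C}' \subsetneq \mathcal{C}}\, \mathcal{F} \setminus \mathcal{C}'$ is unsatisfiable.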

A well-known result, which will be used in the paper, is the minimal hitting set
relationship between MUSes and MCSes of an unsatisfiable formula $\mathcal{F}$ [28,13,10,18].

Theorem 2. Let F be unsatisfiable. Then, 

1. Each MCS of F is a minimal hitting set of the MUSes of F. 

2. Each MUS of F is a minimal hitting set of the MCSes of F. 

As highlighted in Section 4, Theorem 2 forms the basis of all existing approaches for
enumerating MUSes [10,18,17,26]. Moreover, the importance of set duality
relationships in combinatorial optimization is highlighted by recent work [35].
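
The duality of Theorem 2 can be checked by brute force on a tiny formula. The sketch below is purely illustrative and assumes a hypothetical four-clause CNF, $(x) \wedge (\neg x) \wedge (\neg x \vee y) \wedge (\neg y)$; the helper names are made up for this example, and exhaustive enumeration over subsets is only feasible at this scale.

```python
from itertools import combinations, product


def satisfiable(clauses, n_vars):
    """Truth-table SAT check; literals are non-zero ints in DIMACS style."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


def subsets(items):
    items = list(items)
    for k in range(len(items) + 1):
        for s in combinations(items, k):
            yield frozenset(s)


def minimal(sets):
    """Keep only the subset-minimal members of a collection of sets."""
    sets = list(sets)
    return {s for s in sets if not any(t < s for t in sets)}


def muses(F, n_vars):
    idx = range(len(F))
    return minimal(s for s in subsets(idx)
                   if not satisfiable([F[i] for i in s], n_vars))


def mcses(F, n_vars):
    idx = range(len(F))
    return minimal(s for s in subsets(idx)
                   if satisfiable([F[i] for i in idx if i not in s], n_vars))


def minimal_hitting_sets(family, universe):
    return minimal(s for s in subsets(universe) if all(s & m for m in family))


# Clause indices: 0 = (x), 1 = (not x), 2 = (not x or y), 3 = (not y)
F = [[1], [-1], [-1, 2], [-2]]
all_muses = muses(F, 2)
all_mcses = mcses(F, 2)
print(sorted(map(sorted, all_muses)))
print(sorted(map(sorted, all_mcses)))
```

For this formula the two families are mutually dual: the minimal hitting sets of the MUSes are exactly the MCSes, and vice versa.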
3 SAT-Based Axiom Pinpointing in $\mathcal{EL}^+$


This section reviews EL2SAT, a recently proposed approach for axiom pinpointing of
$\mathcal{EL}^+$ ontologies using a Horn formula encoding [31,32]. The EL2SAT approach can be
divided into two main components. First, the axiom pinpointing problem is encoded as
a Horn formula which is polynomial in the size of $\mathcal{T}$. Second, the MinAs are
computed using a dedicated algorithm, exploiting ideas from early work on AllSMT [16].

The EL2SAT axiom pinpointing approach for $\mathcal{EL}^+$ can be summarized as
follows [31,32]:

A. Encode the classification of TBox $\mathcal{T}$ by running the standard classification
algorithm and representing every non-trivial axiom or assertion as a set of Horn
clauses. The resulting Horn formula is denoted $\phi_{\mathcal{T}}$.

B. Encode the complete classification DAG of the input normalized ontology $\mathcal{T}$ as a
Horn formula $\phi^{all}_{\mathcal{T}}$.

C. Associate a Boolean variable $s_{[a_i]}$ with each assertion $a_i$ in the classification
of $\mathcal{T}$ and create a clause $s_{[a_i]} \rightarrow \mathcal{EL}^+2SAT(a_i)$, where
$\mathcal{EL}^+2SAT(a_i)$ is the set of clauses associated with $a_i$ in $\phi_{\mathcal{T}}$.
The resulting formula is denoted $\phi_{\mathcal{T}(so)}$.

D. Encode the pinpointing-only problem as another Horn formula $\phi^{all}_{\mathcal{T}(po)}$.
For the purposes of this paper, this is the formula that is of interest and so the one
analyzed in greater detail.

The construction of the target Horn formula $\phi^{all}_{\mathcal{T}(po)}$ mimics the
construction of the other formulas, in that the classification procedure is executed.
The formula is constructed as follows:


1. For every RI axiom, create an axiom selector variable $s_{[a_i]}$. For a trivial GCI of
the form $C \sqsubseteq C$ or $C \sqsubseteq \top$, $s_{[a_i]}$ is constant true. For each
non-trivial GCI, add an axiom selector variable $s_{[a_i]}$.

2. During the execution of the classification algorithm, for every application of a rule
(concretely $r$) generating some assertion $gen(r)$ (concretely $a_i$), add to
$\phi^{all}_{\mathcal{T}(po)}$ a clause of the form

$$\Bigl(\bigwedge_{a_j \in ant(r)} s_{[a_j]}\Bigr) \rightarrow s_{[a_i]} \qquad (1)$$

where $s_{[a_i]}$ is the selector variable for $a_i$ and $ant(r)$ are the antecedents of
$a_i$ with respect to rule $r$.

As explained in earlier work, the encoding procedure ensures that each rule application
is performed only once. Finally, for the concrete case of axiom pinpointing, specify the
assumption list $\{\neg s_{[C_i \sqsubseteq D_i]}\} \cup \{s_{[a_i]} \mid a_i \in \mathcal{T}\}$. This
assumption list is manipulated by the dedicated algorithm proposed in earlier work [31,32].
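
The idea of recording each rule application as a Horn clause over selector variables can be sketched on a deliberately restricted fragment. The code below is not the EL2SAT encoder: it assumes a hypothetical TBox containing only atomic, non-trivial GCIs $A \sqsubseteq B$ and a single transitivity-like rule, and the names `encode` and `entailed` are invented for this illustration. Each assertion, written as a pair, plays the role of its own selector variable.

```python
def encode(axioms):
    """Saturate atomic GCIs and record every rule application as a Horn clause.

    axioms: list of (A, B) pairs standing for non-trivial atomic GCIs.
    Returns (derived, clauses); a clause (body, head) reads
    'if all selectors in body hold, then the selector of head holds'.
    """
    derived = set(axioms)
    clauses = set()
    changed = True
    while changed:
        changed = False
        for (x, a) in list(derived):
            for (a2, b) in axioms:
                if a2 != a or x == b:
                    continue
                # rule application: from x <= a and a <= b derive x <= b
                clause = (frozenset({(x, a), (a, b)}), (x, b))
                if clause not in clauses:   # each application is encoded once
                    clauses.add(clause)
                    changed = True
                if (x, b) not in derived:
                    derived.add((x, b))
                    changed = True
    return derived, clauses


def entailed(selected, goal, clauses):
    """Forward chaining from the selected axiom selectors."""
    true_vars = set(selected)
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in true_vars and body <= true_vars:
                true_vars.add(head)
                changed = True
    return goal in true_vars


tbox = [("A", "B"), ("B", "C"), ("C", "D")]
derived, clauses = encode(tbox)
print(len(clauses), entailed(tbox, ("A", "D"), clauses))
```

Asserting the selectors of a subset of axioms together with the negated selector of the goal is unsatisfiable exactly when `entailed` returns True for that subset, which mirrors the role of the assumption list above.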

The following theorem is fundamental for earlier work [31,32], and is extended in
the next section to relate MinAs to MUSes of propositional formulae.

Theorem 3 (Theorem 3 in [32]). Given an $\mathcal{EL}^+$ TBox $\mathcal{T}$, for every
$\mathcal{S} \subseteq \mathcal{T}$ and for every pair of concept names $C, D \in PC_{\mathcal{T}}$,
$C \sqsubseteq_{\mathcal{S}} D$ if and only if the Horn propositional formula,

$$\phi^{all}_{\mathcal{T}(po)} \wedge \bigwedge_{ax_i \in \mathcal{S}} (s_{[ax_i]}) \wedge (\neg s_{[C \sqsubseteq D]}) \qquad (2)$$

is unsatisfiable.

Example 2 (GALEN Medical Ontology SAT Encoding). For the ontology in Example 1,
Table 3 illustrates the construction of the formulas described earlier in this
section. Regarding the $s_{[a_i]}$ variables, the original set is $\{s_1, \ldots, s_{13}\}$.
These represent the axioms in the normalized ontology. The remaining ones result from
the classification procedure. It is important to note that we do not expand
$\mathcal{EL}^+2SAT(a_i)$ in $\phi_{\mathcal{T}_{mg}(so)}$. Moreover, a selection variable is
assigned to each role inclusion axiom.


4 Enumeration of MUSes 

The problem of enumerating all the MUSes of an unsatisfiable formula has been studied 
in different settings [28,13,10,18,17,26], starting with the seminal work of Reiter [28]. 

Although the enumeration of MCSes can be achieved in a number of different
ways [19], the enumeration of MUSes is believed to be far more challenging. Intuitively,
the main difficulty is how to block one MUS and then use a SAT solver (or some other
decision procedure) to compute the next MUS. This difficulty with MUS enumeration
motivated a large body of work exploiting a fundamental relationship between MUSes
and MCSes. Indeed, it is well-known (e.g. see Theorem 2) that MCSes are minimal
hitting sets of MUSes, and MUSes are minimal hitting sets of MCSes [28,13,10,18].
As a result, early approaches [10,18] for enumeration of MUSes were organized as
follows: (i) enumerate all MCSes of a formula; (ii) compute the minimal hitting sets of
the MCSes.
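
Step (ii) can be sketched with an incremental, Berge-style dualization: hitting sets are maintained for the MCSes seen so far and extended one MCS at a time. This is a minimal illustration, not the procedure implemented in the tools of [10,18]; the function name and the clause labels `c1` to `c4` are hypothetical.

```python
def minimal_hitting_sets(family):
    """Compute all minimal hitting sets of a family of sets, one set at a time."""
    hitting = {frozenset()}
    for m in family:
        extended = set()
        for h in hitting:
            if h & m:
                extended.add(h)             # h already hits m
            else:
                for e in m:                 # otherwise extend h by one element of m
                    extended.add(h | {e})
        # discard candidates that strictly contain another candidate
        hitting = {h for h in extended if not any(g < h for g in extended)}
    return hitting


mcs_list = [frozenset({"c1"}), frozenset({"c2", "c3"}), frozenset({"c2", "c4"})]
mus_set = minimal_hitting_sets(mcs_list)
print(sorted(sorted(m) for m in mus_set))
```

The intermediate collection can grow exponentially, which is one reason why this two-phase organization works best when the number of MCSes is manageable.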

One problem with early approaches is that all MCSes need to be enumerated
before the first MUS is computed. As a result, more recent work proposed alternative
approaches which enable both MUSes and MCSes to be computed while enumeration
of MUSes (and MCSes) takes place [17,26]. Nevertheless, if the number of MCSes is
manageable, earlier work is expected to be more efficient [18]. It should be noted that
all existing approaches for MUS enumeration relate to Theorem 2, in that enumeration
of MUSes is achieved by explicitly or implicitly computing the minimal hitting sets of
all MCSes.

Regarding MCS enumeration, two main approaches have been studied. One exploits 
MaxSAT enumeration [18,23]. More recent work proposes dedicated MCS extraction 
algorithms, also capable of enumerating MCSes [19]. The approach proposed in this 
paper exploits MaxSAT enumeration. 


5 Axiom Pinpointing with MUS Extraction 

This section shows that the computation of MinAs can be related to MUS enumeration
of the Horn formula $\phi^{all}_{\mathcal{T}(po)}$. It then briefly overviews existing
approaches for MUS enumeration, and concludes by summarizing the MUS enumeration
approach used in this paper.
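$\phi_{\mathcal{T}_{mg}(so)}$ :=
    $s_1 \rightarrow \mathsf{Endocarditis} \sqsubseteq \mathsf{Inflammation}$
    $s_2 \rightarrow \mathsf{Inflammation} \sqsubseteq \mathsf{Disease}$
    $s_3 \rightarrow \mathsf{Endocardium} \sqsubseteq \mathsf{Tissue}$
    $s_4 \rightarrow \mathsf{HeartDisease} \sqsubseteq \mathsf{Disease}$
    $s_5 \rightarrow \mathsf{Endocarditis} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{Endocardium}$
    $s_6 \rightarrow \mathsf{Inflammation} \sqsubseteq \exists\mathsf{actsOn}.\mathsf{Tissue}$
    $s_7 \rightarrow \mathsf{Endocardium} \sqsubseteq \exists\mathsf{contIn}.\mathsf{HeartValve}$
    $s_8 \rightarrow \mathsf{HeartValve} \sqsubseteq \exists\mathsf{contIn}.\mathsf{Heart}$
    $s_9 \rightarrow \mathsf{HeartDisease} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{Heart}$
    $s_{10} \rightarrow \mathsf{Disease} \sqcap \mathsf{N} \sqsubseteq \mathsf{HeartDisease}$
    $s_{11} \rightarrow \exists\mathsf{hasLoc}.\mathsf{Heart} \sqsubseteq \mathsf{N}$
    $s_{12} \rightarrow \mathsf{contIn} \circ \mathsf{contIn} \sqsubseteq \mathsf{contIn}$
    $s_{13} \rightarrow \mathsf{hasLoc} \circ \mathsf{contIn} \sqsubseteq \mathsf{hasLoc}$
    $s_{14} \rightarrow \mathsf{Endocarditis} \sqsubseteq \mathsf{Disease}$
    $s_{15} \rightarrow \mathsf{Endocarditis} \sqsubseteq \exists\mathsf{actsOn}.\mathsf{Tissue}$
    $s_{16} \rightarrow \mathsf{Endocarditis} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{HeartValve}$
    $s_{17} \rightarrow \mathsf{Endocardium} \sqsubseteq \exists\mathsf{contIn}.\mathsf{Heart}$
    $s_{18} \rightarrow \mathsf{HeartDisease} \sqsubseteq \mathsf{N}$
    $s_{19} \rightarrow \mathsf{Endocarditis} \sqsubseteq \exists\mathsf{hasLoc}.\mathsf{Heart} \leftarrow \mathsf{N}$
    $s_{20} \rightarrow \mathsf{Endocarditis} \sqsubseteq \mathsf{N}$
    $s_{21} \rightarrow \mathsf{Endocarditis} \sqsubseteq \mathsf{HeartDisease}$

$\phi^{all}_{\mathcal{T}_{mg}(po)}$ := {
    $s_1 \wedge s_2 \rightarrow s_{14}$,
    $s_6 \wedge s_1 \rightarrow s_{15}$,
    $s_{13} \wedge s_7 \wedge s_5 \rightarrow s_{16}$,
    $s_{12} \wedge s_8 \wedge s_7 \rightarrow s_{17}$,
    $s_{11} \wedge s_9 \rightarrow s_{18}$,
    $s_{13} \wedge s_{16} \wedge s_8 \rightarrow s_{19}$,
    $s_{13} \wedge s_{17} \wedge s_5 \rightarrow s_{19}$,
    $s_{11} \wedge s_{19} \rightarrow s_{20}$,
    $s_{10} \wedge s_{14} \wedge s_{20} \rightarrow s_{21}$,
    $s_4 \wedge s_{21} \rightarrow s_{14}$,
    $s_9 \wedge s_{21} \rightarrow s_{19}$ }

$\phi^{all}_{\mathcal{T}_{mg}(po)} \wedge \bigwedge_{1 \le i \le 13} (s_i) \wedge (\neg s_{21})$

MinA := $\{s_1, s_5, s_8, s_{10}, s_{11}, s_{13}\}$

Table 3: The $\mathcal{T}_{mg}$ Horn encoding [31,32].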
5.1 MinAs as MUSes


Although not explicitly stated, the relation between axiom pinpointing and MUS
extraction has been apparent in earlier work [7,31,32]. Indeed, from the encoding of
axiom pinpointing as a Horn formula, the following result is immediate. It relates
Theorem 3 (Theorem 3 in [32]) with the dedicated algorithm proposed in earlier
work [31,32]:

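Theorem 4. Given an $\mathcal{EL}^+$ TBox $\mathcal{T}$, for every $\mathcal{S} \subseteq \mathcal{T}$ and
for every pair of concept names $C, D \in PC_{\mathcal{T}}$, $\mathcal{S}$ is a MinA of
$C \sqsubseteq_{\mathcal{S}} D$ if and only if the Horn propositional formula,

$$\phi^{all}_{\mathcal{T}(po)} \wedge \bigwedge_{ax_i \in \mathcal{S}} (s_{[ax_i]}) \wedge (\neg s_{[C \sqsubseteq D]}) \qquad (3)$$

is minimally unsatisfiable.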

Proof [Sketch] By Theorem 3, $C \sqsubseteq_{\mathcal{S}} D$ if and only if the associated Horn
formula (3) is unsatisfiable. For a MinA $\mathcal{S} \subseteq \mathcal{T}$, minimal
unsatisfiability of (3) (with $\mathcal{T}$ replaced by $\mathcal{S}$) results from the MinA
computation algorithm proposed in earlier work [31,32]. □
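
One direct way to exploit this correspondence is to extract a single MinA with a deletion-based MUS extraction loop over the axiom selectors. The sketch below is an illustration only and is not the algorithm of [31,32]: the toy Horn clauses, the selector names `s1` to `s4`, the derived variable `d1` and the functions `derives` and `one_mina` are all hypothetical.

```python
def derives(hard, selected, goal):
    """Forward chaining: is `goal` forced once the selected selectors are true?"""
    true_vars = set(selected)
    changed = True
    while changed:
        changed = False
        for body, head in hard:
            if head not in true_vars and body <= true_vars:
                true_vars.add(head)
                changed = True
    return goal in true_vars


def one_mina(hard, axioms, goal):
    """Deletion-based extraction: drop each axiom whose removal keeps the entailment."""
    if not derives(hard, axioms, goal):
        raise ValueError("the goal is not entailed by the full axiom set")
    core = list(axioms)
    for ax in list(axioms):
        trial = [a for a in core if a != ax]
        if derives(hard, trial, goal):
            core = trial    # ax is not needed for this explanation
    return set(core)


# Toy encoding with two explanations of `goal`: {s1, s2, s4} and {s3, s4}
hard = [
    (frozenset({"s1", "s2"}), "d1"),
    (frozenset({"s3"}), "d1"),
    (frozenset({"d1", "s4"}), "goal"),
]
mina = one_mina(hard, ["s1", "s2", "s3", "s4"], "goal")
print(sorted(mina))
```

The loop needs one entailment check per axiom, and the MinA it returns depends on the order in which axioms are tried.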

Based on Theorem 4 and the MUS enumeration approaches summarized in
Section 4, we can now outline our approach based on MUS enumeration.

Earlier work [31,32] explicitly enumerates assignments to the $s_{[ax_i]}$ variables in an
AllSMT-inspired approach [16]. In contrast, our approach is to model the problem as
partial maximum satisfiability (MaxSAT), and enumerate the MUSes of the MaxSAT
problem formulation.

All clauses in $\phi^{all}_{\mathcal{T}(po)}$ are declared as hard clauses, i.e. they must be
satisfied. Observe that, by construction, $\phi^{all}_{\mathcal{T}(po)}$ is satisfiable. In
addition, the constraint $C \sqsubseteq_{\mathcal{T}} D$ is encoded with another hard clause,
namely $(\neg s_{[C \sqsubseteq D]})$. Finally, the variable $s_{[ax_i]}$ associated with each
axiom $ax_i$ denotes a unit soft clause. The intuitive justification is
that the goal is to include as many axioms as possible, leaving out a minimal set which,
if included, would cause the complete formula to be unsatisfiable. Thus, each of these
sets represents an MCS of the MaxSAT problem formulation, but also a minimal set of
axioms that needs to be dropped for the subsumption relation not to hold. MCS
enumeration can easily be implemented with a MaxSAT solver [18,23] or with a dedicated
algorithm [19]. Moreover, we can now use minimal hitting set dualization [28,13,10,18]
to obtain the MUSes we are looking for, starting from the previously computed MCSes.
This is the approach implemented in this paper.
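
The pipeline of hard clauses, soft unit clauses, MCS enumeration and dualization can be sketched end to end on a toy instance. The code below is a hedged illustration, not the EL2MCS implementation: brute-force subset enumeration stands in for the MaxSAT-based MCS enumerator, a Berge-style loop stands in for the hitting set dualization tool, and the clauses and names (`s1` to `s4`, `d1`, `goal`) are hypothetical.

```python
from itertools import combinations


def derives(hard, selected, goal):
    """Forward chaining over the hard Horn clauses."""
    true_vars = set(selected)
    changed = True
    while changed:
        changed = False
        for body, head in hard:
            if head not in true_vars and body <= true_vars:
                true_vars.add(head)
                changed = True
    return goal in true_vars


def enumerate_mcses(hard, soft, goal):
    """Minimal sets of soft units whose removal makes hard + (not goal) satisfiable."""
    found = []
    for k in range(len(soft) + 1):           # increasing size guarantees minimality
        for removed in combinations(soft, k):
            removed = frozenset(removed)
            if any(m <= removed for m in found):
                continue                      # contains a smaller MCS, so not minimal
            if not derives(hard, set(soft) - removed, goal):
                found.append(removed)
    return found


def minimal_hitting_sets(family):
    hitting = {frozenset()}
    for m in family:
        extended = set()
        for h in hitting:
            if h & m:
                extended.add(h)
            else:
                extended.update(h | {e} for e in m)
        hitting = {h for h in extended if not any(g < h for g in extended)}
    return hitting


hard = [
    (frozenset({"s1", "s2"}), "d1"),
    (frozenset({"s3"}), "d1"),
    (frozenset({"d1", "s4"}), "goal"),
]
soft = ["s1", "s2", "s3", "s4"]              # one soft unit clause per axiom

mcs_list = enumerate_mcses(hard, soft, "goal")
minas = minimal_hitting_sets(mcs_list)
print(sorted(sorted(m) for m in mcs_list))
print(sorted(sorted(m) for m in minas))
```

In this toy instance the hard clauses alone are satisfiable, every MCS is a minimal set of axioms whose removal breaks the entailment of `goal`, and the hitting sets of the MCSes are the two explanations {s3, s4} and {s1, s2, s4}.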

Example 3. For the ontology in Example 1 and Example 2, Table 4 summarizes the
MaxSAT formulation, as well as an example MUS. The hard clauses are given by
$\phi^{all}_{\mathcal{T}(po)}$ and by $(\neg s_{[C \sqsubseteq D]})$. Each positive unit soft clause
is given by the selection variable associated with each of the original axioms. The
EL2MCS approach starts by computing the MCSes, using a MaxSAT-based approach, and then
computes the MUSes by minimal hitting set dualization (e.g. see Theorem 2).

$\phi_H$ := $\{\phi^{all}_{\mathcal{T}_{mg}(po)}, (\neg s_{21})\}$
$\phi_S$ := $\{s_1, s_2, \ldots, s_{13}\}$
$\phi$ := $\{\phi_H \cup \phi_S\}$
MUS := $\{s_1, s_5, s_8, s_{10}, s_{11}, s_{13}\}$

Table 4: The MaxSAT formulation.


Fig. 1: The EL2MCS tool


Related Work. Reiter’s work on computing minimal hitting sets [28] is used to find
all the justifications in earlier work [15,37], which exploits hitting set trees (HST).
This approach starts by computing a single justification using a standard black-box
method, either a sliding-window deletion-based algorithm (SINGLE_JUST_ALG in [15])
or a binary-search-based algorithm (log-extract-mina in [9]). After finding the first
justification using any of these methods, the algorithm removes axioms one by one from
this justification set and constructs a Hitting Set Tree (HST) that in turn serves to
find all the justifications (using either SINGLE_JUST_ALG or log-extract-mina on each
branch of the HST). The main drawback of this approach is that it does not scale to
large ontologies. Therefore, this method is only used in conjunction with
reachability-based modules [9] and when a TBox contains only a few relevant axioms
subject to an entailment.

5.2 The EL2MCS Axiom Pinpointing Approach 

This section summarizes the organization of the EL2MCS tool, which exploits MUS
enumeration for axiom pinpointing. The organization of EL2MCS is shown in Figure 1.
The first step is similar to EL2SAT [31,32] in that a propositional Horn formula is
generated. The next step, however, exploits the ideas in the previous section, and
generates a partial MaxSAT encoding. As outlined earlier, we can now enumerate the MCSes
of the partial MaxSAT formula. This is achieved with the CAMUS2 tool [19] 2 . The final
step is to exploit minimal hitting set dualization for computing all the MUSes given
the set of MCSes [18]. This is achieved with the CAMUS tool 3 . It should be observed
that, although MCS enumeration uses CAMUS2 (a modern implementation of the MCS
enumerator in CAMUS [18], capable of handling partial MaxSAT formulae), alternative
MCS enumeration approaches were considered [19], but were not as efficient.

6 Experimental Results 

The experiments were performed on an HPC cluster, with dual quad-core Intel Xeon
3GHz processors, with 32GB of physical memory. The tools EL2SAT and EL2MCS
were given a timeout of 3600 seconds and a memory limit of 16GB. EL2MCS uses
the partial MaxSAT instances generated with the help of the EL2SAT Horn propositional
encoding tool 4 . The medical ontologies used in the experimentation are GALEN [27],
Gene [2], NCI [33] and SNOMED-CT [36] 5 . We have used the 450 subsumption query
instances which are studied in earlier work [32] (and which are available from the
EL2SAT website). The experimental results compare exclusively EL2SAT and EL2MCS,
since EL2SAT has recently been shown to consistently outperform CEL [5], an
$\mathcal{EL}^+$ axiom pinpointing tool [32]. Moreover, unless otherwise stated, the
experiments consider the optimizations proposed in earlier work [31,32], and exploited
in the EL2SAT tool. One example is the cone-of-influence (COI) simplification
technique [32], which enables significant reductions in the obtained propositional
satisfiability instances. It should be noted that the cone-of-influence reduction
technique can be related to $\mathcal{EL}^+$ reachability-based modularization [9].

2 Available from http://logos.ucd.ie/web/doku.php?id=mcsls.

3 Available from http://sun.iwu.edu/~mliffito/camus/.

Fig. 2: Cactus plot comparing EL2MCS and EL2SAT on all problem instances, with COI
reduction


4 The $\mathcal{EL}^+$ to SAT encoder is denoted el2sat_all, whereas the axiom pinpointing
tool is el2sat_all_mins. These tools are available from
http://disi.unitn.it/~rseba/elsat/. Moreover, this site also contains the subsumption
query instances used in the experiments.

5 GENE, GALEN and NCI ontologies are freely available at
http://lat.inf.tu-dresden.de/~meng/toyont.html. The SNOMED-CT ontology was requested
from IHTSDO under a nondisclosure license agreement.

Fig. 3: Scatter plot comparing EL2MCS and EL2SAT on all problem instances, with COI
reduction


Figure 2 shows a cactus plot comparing EL2SAT with EL2MCS. The performance
gap between EL2SAT and EL2MCS is clear and conclusive. As summarized in
Table 5, EL2SAT solves 241 out of 450 instances, whereas EL2MCS solves 448 out of
450 instances. Figure 3 shows a scatter plot comparing the two tools. As before, the
performance gap between EL2SAT and EL2MCS is clear, with performance gains that
often exceed one order of magnitude, and that can even exceed three orders of
magnitude. More significantly, for 207 instances (i.e. 209 − 2, out of 450), and in
contrast with EL2MCS, EL2SAT does not terminate within the given timeout.

Table 5 summarizes the statistics of running the two tools with and without COI
reduction. As can be concluded, COI reduction is far more relevant for EL2SAT than
for EL2MCS. When COI reduction is used, EL2SAT outperforms EL2MCS on 16.9%
of the instances. It should be noted that all of these instances terminate in less than
0.5 seconds for both tools, as can be concluded from Figure 3. In contrast, EL2MCS
outperforms EL2SAT on 74.9% of the instances, and for 207 (i.e. 209 − 2) of these,
EL2SAT does not terminate. The statistics further support the conclusion that the
extraction of MUSes provides a far more robust solution than a dedicated algorithm based
on enumeration of subsets and exploiting AllSMT techniques.

(a) With COI reduction            (b) Without COI reduction

            EL2SAT   EL2MCS                   EL2SAT   EL2MCS
# Solved    241      448          # Solved    8        447
% Solved    53.6%    99.6%        % Solved    1.8%     99.3%
# TO        209      2            # TO        442      3
% TO        46.4%    0.4%         % TO        98.2%    0.7%
% Wins      16.9%    74.9%        % Wins      0.2%     99.3%

Table 5: Statistics summarizing the results on a universe of 450 problem instances with a
timeout of 3600s, with and without COI reduction.


(a) With COI reduction            (b) Without COI reduction

EL2SAT   EL2MCS   Ties            EL2SAT   EL2MCS   Ties
2        19       429             3        71       376

Table 6: Number of times a tool computes more MUSes within a timeout of 3600s, with and
without COI reduction.


Table 6 summarizes the number of times each tool computes more MUSes than
the other tool, independently of being able to prove the non-existence of additional
MUSes within the given timeout. As can be observed, EL2SAT is actually quite capable
of finding most MUSes, especially when COI reduction is applied. The reason is
how the algorithm is implemented and the preference to promote conflicts. However, as
shown in Figure 2 and Figure 3, in many cases EL2SAT takes significantly more time
to compute all MUSes, and it is often unable to prove the non-existence of additional
MUSes. These results also indicate that MUSes in the axiom pinpointing instances
considered are usually small. As a result, EL2SAT, which uses assumptions for the
implicit set enumeration of target sets, is able to compute most MUSes in many cases.
In contrast, in situations where the size of MUSes is larger, EL2SAT would be expected
to be unable to reach a stage where the sets representing MUSes would be enumerated.

7 Conclusions & Future Work 

Axiom pinpointing serves to identify unintended subsumption relations in DLs. For the
$\mathcal{EL}$ family of DLs, there has been recent work on axiom pinpointing, the most
efficient of which is based on encoding the problem as propositional Horn formulae. The
main contribution of this paper is to relate axiom pinpointing with MUS extraction. As a
result, this enables exploiting different MUS extraction and enumeration algorithms for
the problem of axiom pinpointing. Preliminary results, obtained using off-the-shelf
tools, show categorical performance gains over the current state of the art in axiom
pinpointing for the $\mathcal{EL}$ family of DLs.

The results justify further analysis of the uses of MUSes and MCSes in axiom
pinpointing, for the $\mathcal{EL}$ family of DLs, but also for other DLs.